Genomics, Proteomics & Bioinformatics — Latest Matching Preprints

1

ExMODE: A Multi-Omics Repository for Extremophile Adaptation and Bioprospecting

Li, D.; Ma, K.; Zhang, Y.; Wang, J.; Cui, Z.; Li, X.; Wang, W.; Tong, J.; Guo, Y.; Wang, Z.; Zeng, P.; Wang, J.; Xu, X.; Zhang, N.; Zhang, Y.; Chen, J.; Hu, Q.; Yang, W.; Li, Z.; Yang, T.; Du, W.; Xu, Z.; Yue, Z.; Wang, J.; Fan, G.; Zhang, W.; Xu, X.; Huo, L.; Wei, X.; Meng, L.; Liu, S.

2026-04-29 microbiology 10.64898/2026.04.27.720953 medRxiv

Top 0.1%

40.9%

Extreme environments, though hostile to most life forms, host specialized extremophile communities that have redefined biological cognition and emerged as vital biotechnological resources, with their unique adaptive traits and bioactive molecules driving advances in multiple scientific and industrial fields. However, research on extremophiles is hindered by limitations in culture-based methods, fragmented multi-omics data with non-uniform annotation standards across repositories, the lack of cross-extreme comparative research in existing resources, and the singularity of data dimensionality that neglects key structural information, all of which restrict the functional interpretation of extremophile microbes and the exploitation of their bioprospecting potential. To tackle these challenges, we developed ExMODE (https://db.genomics.cn/exmode/), a comprehensive multi-omics database platform dedicated to extremophiles. It centrally integrates multi-omics data from diverse extreme habitats with a standardized annotation framework, resolving data fragmentation and enabling systematic cross-environment comparative analyses to elucidate extremophile adaptive mechanisms. Moreover, ExMODE aggregates multi-dimensional datasets including genes, genomes, secondary metabolite sequences and protein structures, overcoming the constraints of single-dimensional data and significantly improving the efficiency of biotechnological resource discovery from extreme microorganisms.

2

Grass Expression Atlas: an RNA-seq-based expression resource for grass species.

Kambara, K.; Chen, Q.; Tsugama, D.

2026-03-16 bioinformatics 10.64898/2026.03.13.711518 medRxiv

Top 0.1%

37.4%

Grass Expression Atlas (GExA) is an interactive web-based resource for rapid exploration of gene expression across diverse tissues, developmental stages, and conditions in grass species. GExA integrates publicly available RNA sequencing (RNA-seq) datasets for four millets: pearl millet (Cenchrus americanus), foxtail millet (Setaria italica), proso millet (Panicum miliaceum), and finger millet (Eleusine coracana), and includes barley (Hordeum vulgare) and sorghum (Sorghum bicolor) as reference species. Datasets were processed using a unified processing workflow to generate expression values in transcripts per million (TPM). The current release comprises 4,673 samples from 442 BioProjects, including 987 pearl millet samples and 2,216 foxtail millet samples, and is provided through a user-friendly web interface. GExA is designed for scalable expansion to additional species via the pipeline used in this study. GExA is freely available at https://webpark2116.sakura.ne.jp/RNADB.
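
The abstract reports expression values in transcripts per million (TPM) but does not describe the GExA pipeline itself. As a rough illustration only, the standard TPM definition can be sketched as follows; the function name, the toy counts, and the gene lengths are all hypothetical and are not taken from GExA.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts for one sample to TPM.

    counts     -- reads assigned to each transcript (hypothetical toy data)
    lengths_bp -- transcript lengths in base pairs, same order as counts
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Step 1: normalize by transcript length (reads per kilobase).
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rpk)
    if total == 0:
        return [0.0] * len(rpk)
    # Step 2: scale so that the values in the sample sum to one million.
    return [r / total * 1_000_000 for r in rpk]


# Toy example: counts proportional to length give equal TPM values.
values = tpm([100, 200, 300], [1000, 2000, 3000])
```

Because TPM values are rescaled to sum to one million within each sample, they are comparable as relative abundances across samples processed the same way, which is presumably why a unified workflow matters for a multi-project atlas.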

3

A comprehensive DNA methylation atlas for the Chinese population through nanopore long-read sequencing of 106 individuals

Li, Y.; Jiang, T.; Qian, L.; Wang, Y.

2026-04-23 genomics 10.64898/2026.04.20.719515 medRxiv

Top 0.1%

22.3%

DNA methylation constitutes the primary epigenetic language mediating organismal phenotypic plasticity. Establishing a cohort-level genomic methylation landscape featuring wide geographical diversity is fundamental for dissecting its genetic and environmental attributes. Leveraging nanopore sequencing's strength in genome-methylome co-sequencing, we generated a whole-genome, haplotype-resolved methylation atlas for 106 individuals from 19 provinces across China. The atlas identified 27,609,354 CpG sites genome-wide, with notably more information in gene-proximal regions and CpG islands than whole-genome bisulfite sequencing provides. Detailed analyses revealed genomic structural variants as a pervasive covariate of DNA methylation, with a remarkable 2-fold compensation effect found in genome-wide heterozygous deletions. In addition, habitat altitude was found to be a strong environmental determinant of DNA methylation. We established a quantitative relationship between altitude and methylation states and identified a gene set strictly responsive to altitude differences, revealing epigenetically regulated genes such as PRDM16, EPHB2 and WNT7A. The methylation atlas provides a reference resource to facilitate further explorations into human epigenetics.

4

In Silico Structure-Based Interactomic Analysis of the Scaffolding Protein DCAF7

Mezghrani, A.; Reys, V.; Labesse, G.

2026-05-15 bioinformatics 10.64898/2026.05.13.724911 medRxiv

Top 0.2%

21.9%

WD40 domains share a widespread β-propeller fold and often act as versatile scaffold proteins. Despite their central role in organizing dynamic cellular complexes, the molecular and structural mechanisms of many WD40 proteins remain poorly understood. Among them, DCAF7, a ubiquitously expressed and essential gene in humans, encodes a WD40 protein that is highly conserved in eukaryotic organisms. It is known to interact with multiple, functionally diverse partners to coordinate the cellular activity of several protein kinases as well as transcriptional regulators, thereby modulating key cellular processes such as cell growth, differentiation, and transcriptional regulation. However, the precise mode of action of DCAF7 is unknown, and its substantial sequence divergence from better-characterized WD40 proteins prevents information transfer by similarity. Structural interactomics can reveal how protein-protein interactions (PPIs) occur within an organism, and is essential for understanding biological functions and developing new therapeutic strategies. Using SLiMAn2, AlphaFold2/3 and PSSMsearch, we identified a conserved α-helical short linear motif (SLiM) in several well-known DCAF7 partners that binds to the top surface of its β-propeller. This motif was subsequently used to generate a regular expression to identify potential new direct binders across the DCAF7 meta-interactome and the human proteome. Domain-domain interactions were also predicted for some other partners. Finally, modeling of oligomeric complexes with these new hits reveals the structural basis of DCAF7 scaffolding, with links to neurodevelopmental disorders such as autism.
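
The abstract mentions turning a short linear motif into a regular expression and scanning a proteome with it, but it does not give the motif. The sketch below only illustrates the general idea: the pattern `[RK]..L..[FW]`, the function name, and the toy sequences are invented for this example and are not the DCAF7-binding motif or the authors' pipeline.

```python
import re

# HYPOTHETICAL pattern for illustration only; not the motif from the preprint.
# The lookahead wrapper lets overlapping occurrences be reported as well.
HYPOTHETICAL_MOTIF = re.compile(r"(?=([RK]..L..[FW]))")


def scan_proteome(proteome):
    """Return (protein name, 1-based start, matched segment) for each hit.

    proteome -- dict mapping protein names to amino-acid sequences
    """
    hits = []
    for name, sequence in proteome.items():
        for match in HYPOTHETICAL_MOTIF.finditer(sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    return hits


# Toy sequences (made up); only protA contains the hypothetical pattern.
toy_proteome = {
    "protA": "MSTRAALDEFGG",
    "protB": "MGGSSPPQQ",
}
hits = scan_proteome(toy_proteome)
```

A plain regular-expression scan like this tends to return many chance matches, so in practice hits would need further filtering (for example by conservation or structural modeling, as the abstract describes) before being treated as candidate binders.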

5

CellClick: an interactive platform for adjustable and accurate cell type annotation in single-cell and spatial omics data

Shi, L.; Dai, M.; Zhang, Y.-b.; Wu, S.; Wang, M.; Wang, X.-j.

2026-06-03 bioinformatics 10.64898/2026.06.01.727775 medRxiv

Top 0.2%

18.6%

Single-cell omics and spatial omics technologies are now widely used in biological and medical research. In both single-cell and spatial omics data analysis, accurate cell type annotation is a key step for downstream analysis and scientific discovery. However, high-quality cell annotation usually requires multiple rounds of manual refinement, which poses a great challenge for most researchers. Here, we present CellClick, an interactive platform for convenient and accurate cell type annotation in single-cell and spatial omics data. CellClick provides Data Preprocessing, Data Visualization, Cell Annotation, Annotation Validation, and Cell Reannotation modules, which facilitate automatic or user-guided cell selection and annotation. The feasibility of using CellClick to generate more accurate cell annotation results was demonstrated with both scRNA-seq and spatial transcriptomics data.

6

Eleven deep-sea coral genome assemblies unveil insights into evolution, adaptation, and coral biodiversity

Zhang, N.; Li, L.; Ta, K.; Shi, C.; Seim, I.; Zhang, Y.; Zhang, W.; Cui, Z.; Xiang, X.; Jia, L.; Ge, Q.; Du, M.; Xie, T.; Ji, Q.; Yue, Z.; Fan, G.; Liu, S.; Meng, L.

2026-05-07 genomics 10.64898/2026.05.06.723128 medRxiv

Top 0.2%

18.6%

Deep-sea corals are vital in maintaining coral ecosystem biodiversity, yet their genetic characteristics remain largely unexplored. Here, we present 11 deep-sea coral genome assemblies, including four Hexacorallia and seven Octocorallia species, significantly contributing new genomic information across two orders. Our analysis reveals the historical dynamics of coral speciation and the influence of environmental factors on the evolution of coral reef ecosystems. A total of 126 horizontal gene transfer (HGT) events were detected, among which genes from the ancestor of Symbiodiniaceae indicate that the ancestors of deep-sea corals may have inhabited shallow-sea environments. Notably, several of these HGTs are involved in phosphorus (PhnX/PhnW) and cholesterol (DHCR7) metabolism within corals, indicating that HGTs may serve as an adaptive survival strategy for the coral holobionts. Deep-sea corals also rely on symbiotic bacteria to synthesize 10 essential amino acids (such as valine and tyrosine), retaining only partial amino acid synthesis capacity. In addition, we investigated the evolution of key biological rhythm genes and temperature adaptation in corals. The loss of key rhythm genes (e.g., clock and cry) in deep-sea corals and copy number differences in genes related to heat stress (e.g., Cbl-b and Rchy) revealed genetic differences between deep-sea and shallow-sea corals. Our new genome assemblies enhance the understanding of deep-sea coral evolution, biodiversity, and adaptation, providing a genetic foundation for coral conservation.

7

Chromosome-scale genome of the woody oilseed crop sacha inchi elucidates the molecular basis of alpha-linolenic acid biosynthesis and triacylglycerol accumulation in seeds

Pan, B.-Z.; Zhang, X.; Hu, X.-D.; Fu, Q.; Chen, M.-S.; Tao, Y.-B.; Niu, L.-J.; He, H.; Shen, Y.; Cheng, Z.; Lang, T.; Liu, C.; Xu, Z.-F.

2026-03-20 genomics 10.64898/2026.03.18.712556 medRxiv

Top 0.2%

18.2%

Sacha inchi (Plukenetia volubilis L.) is an emerging woody oilseed crop prized for its high alpha-linolenic acid (ALA) content. Despite its nutritional and economic value, the lack of high-quality genomic resources has hindered genetic improvement and the elucidation of its unique polyunsaturated fatty acid and lipid biosynthetic pathways. In this study, we report a high-quality, chromosome-scale genome assembly of sacha inchi with a total length of 710.62 Mb, integrated from Illumina, PacBio, and chromosome conformation capture (Hi-C) technology. The genome harbors 37,570 protein-coding genes and 379.86 Mb (53.45%) of repetitive sequences. Phylogenomic analysis reveals that sacha inchi diverged from its closest relative, Ricinus communis, approximately 36.2 million years ago. Comparative genomics indicates that sacha inchi experienced only ancient whole genome duplication events. To elucidate the mechanisms governing ALA biosynthesis and triacylglycerol (TAG) accumulation in sacha inchi seeds, we performed temporal transcriptome profiling across six seed development stages. Our findings demonstrate that high TAG content is primarily driven by the sustained expression of biosynthetic genes and low activity of degradation genes during mid-to-late seed development. Notably, while genes encoding stearoyl-ACP desaturases (SADs) maintain the precursor pool, the expression of genes encoding fatty-acid desaturase 2 (FAD2) and fatty-acid desaturase 3 (FAD3) is positively correlated with the final accumulation of C18:2 and C18:3 fatty acids. We also identified lncRNAs as potential epigenetic regulators of these key pathways. This high-quality genome provides a critical foundation for elucidating the molecular mechanisms of seed growth and development in sacha inchi.

8

Complete Telomere-to-Telomere Assembly of the Y Chromosome in the Chinese Quartet

Wang, B.; Wan, S.; Zhang, P.; Zhang, Y.; Wang, X.; Dong, L.; Ye, K.; Yang, X.

2026-04-16 genomics 10.64898/2026.04.13.718326 medRxiv

Top 0.3%

16.9%

The complete assembly of the human Y chromosome remains a challenge due to its highly repetitive and complex structure. While complete telomere-to-telomere (T2T) assemblies have been generated for a few individuals, such high-quality resources for East Asian populations, particularly for well-characterized multi-omics reference cohorts, are still scarce. The Chinese Quartet, comprising monozygotic twin daughters and their parents, is a premier reference material for genomic studies, yet a T2T-level Y chromosome assembly for this pedigree was lacking. Here, we present a complete, gapless T2T assembly of the Y chromosome (designated CQ-chrY) from the father of the Chinese Quartet. This assembly was generated by integrating Oxford Nanopore ultra-long reads, PacBio HiFi reads, and Hi-C data, resulting in a sequence of 61.88 Mb. The assembly shows exceptional base accuracy (QV = 51.09) and structural completeness (GCI = 100; CRAQ AQI = 95.217). We completely resolved the 33.52 Mb Yq12 heterochromatic region and annotated 164 protein-coding genes and 51.03 Mb (82.47%) of repetitive sequences. This CQ-chrY assembly represents the third complete Chinese Y chromosome and fills the last gap in the T2T assemblies of the Quartet family, providing an invaluable paternal haplotype resource for expanding East Asian genomic standards and for studies on Y chromosome structural variation and evolution.

9

Histone modifications analysis reveals enhancers reprogramming during maternal-to-zygotic transition

Hu, K.; Wang, C.; Fang, D.; Lu, J.; Meng, X.; Chen, L.; Yao, Y.; Guo, J.; Khan, S.; Li, W.; Wang, Y.; Li, Y.; Chen, H.; Xu, J.

2026-05-09 developmental biology 10.64898/2026.05.06.723106 medRxiv

Top 0.3%

16.7%

Enhancers are key epigenetic regulatory elements that orchestrate spatiotemporal gene expression and are critical in mammalian development, gene regulation, and disease. Histone modifications such as H3K4me1 (a canonical enhancer mark) and H3K27ac (which distinguishes active enhancers) remain poorly characterized during early mammalian embryogenesis. Using low-input CUT&RUN (Cleavage Under Targets and Release Using Nuclease) with input as low as 50 cells, this study profiles genome-wide H3K4me1 and H3K27ac patterns in mouse oocytes and pre-implantation embryos. Both marks are enriched in distal regions and exhibit distinct sequence preferences and reprogramming dynamics in pre-implantation embryos. H3K27ac is reprogrammed at the 2-cell stage and marks active enhancers, while H3K4me1 is remodeled at the 4-cell stage and co-localizes with H3K27ac, overlapping with accessible chromatin regions. Interestingly, the co-localization of H3K4me1 and H3K27ac is also detected in promoter regions, where they exhibit a mutually exclusive pattern with H3K4me3. Three enhancer types, active (H3K4me1/H3K27ac), primed (H3K4me1), and poised (H3K4me1/H3K27me3), are dynamically remodeled during the maternal-to-zygotic transition (MZT), with active enhancers increasing significantly after zygotic genome activation. Furthermore, genome-wide super-enhancers are identified and are mainly enriched in promoters. The differences in gene expression between stages may be related to the specific motifs enriched in super-enhancers.

10

Comprehensive bioinformatic analysis reveals novel potential diagnostic biomarkers associated with monocytes in osteoporosis

Qin, X.; Wen, B.; He, P.; Chen, Z.; Tan, S.; Mao, Z.

2026-03-24 genetics 10.64898/2026.03.20.713320 medRxiv

Top 0.3%

15.1%

Osteoporosis affects millions of women globally. In this study, we applied bioinformatics methods to screen for novel diagnostic biomarkers of osteoporosis in women using the GSE62402 and GSE56814 datasets. PCSK5, ZNF225, and H1FX were used to construct a diagnostic model. ROC, calibration, and decision curve analyses were performed to assess the diagnostic performance on the training (GSE56814) and external (GSE56815) datasets. The expression level of model genes was validated in GEO datasets. Furthermore, five transcription factors (ETS1, NOTCH1, MAZ, ERG, and FLI1) were identified as common upstream regulators of model genes. PCSK5, ZNF225, and H1FX serve as novel diagnostic biomarkers, providing new insights into the pathogenesis of and treatment strategies for osteoporosis in women.

11

Agent-Driven Validation of Oncology Therapeutic Targets

Huang, K.-l.; Accelerated Discovery with Agents (ADA) Consortium

2026-05-03 genomics 10.64898/2026.04.29.721634 medRxiv

Top 0.3%

15.0%

Selecting the correct target is critical in drug development, yet systematic replication of published target claims is rarely performed. Here, we introduce a replication-focused AI agent framework to evaluate 31 gene target-disease hypotheses, including context-specific oncology targets from both retracted and non-retracted papers. Each target claim was translated into a zero-shot validation prompt executed by a biomedical research agent in one round, and all agent-driven analyses were validated and scored by a domain expert. Compared to retracted targets (2/17 validated, 11.8%), non-retracted targets (9/14 validated, 64.3%) were 17-fold more likely to show context-specific dependency in agent-driven analyses. The replicated targets include WRN in microsatellite stable cancer, PRMT5 in MTAP-deleted cancer, as well as more recent discoveries such as PTGES3, HASPIN, SLC5A3, PKMYT1, FAM126B, and PAPSS1. These results demonstrate that agent-human collaboration can conduct data-driven validation at scale, improve target prioritization, and systematically reduce translational risk for drug development.

12

A graph-based pangenome reveals the genetic basis of climate-resilient and horticultural traits in pear

Gao, Y.; Wang, W.; Liu, Y.; Wu, J.; Wang, L.; Wei, J.; Dai, M.; Wei, C.; Tian, L.; Jiang, C.; Su, J.; Xue, H.; Liu, H.; Ni, J.; Jiang, S.; Cai, D.; Zheng, X.; Zhang, D.; Bai, S.

2026-05-12 plant biology 10.64898/2026.05.08.723691 medRxiv

Top 0.3%

14.8%

Climate change poses an increasing threat to the cultivation of deciduous fruit trees, placing greater demands on modern pear breeding. Using pear germplasm adapted to diverse environments, we assembled 11 chromosome-level genomes. In combination with 13 publicly accessible pear genomes, we analyzed presence-absence variations (PAVs) and constructed a graph-based pangenome for pear. By performing a PAV-eQTL analysis of the fruit of 123 pear accessions, we identified PAVs significantly associated with expression levels of genes that may be involved in regulating agronomic traits. Population analysis of 268 pear accessions revealed two stop-gained variants in DAM1 of independent origin, which may function in advancing the blooming date and reducing the chilling requirement. We detected complex PAVs at the NOR1 locus, including two copy-number variations and one deletion. These PAVs contributed to the rapid diversification of the NOR1 locus and the fruit development period through regulating ARF5 and other ripening-related genes. We revealed the selection history of the NOR1 locus and developed novel pear individuals that accumulated alleles for low chilling requirement, early blooming date, and short fruit development period. The results provide valuable resources for pear genomics research and offer a guideline for breeding modern pears with climate resilience.

13

Structure-based similarity network accelerates the discovery of lysins as oral microbiome modulators targeting periodontal pathogens

Yao, F.; He, J.; Nyaruaba, R.; Chen, F.; Zhou, J.; Yang, H.; Wei, H.; Li, Y.

2026-04-13 microbiology 10.64898/2026.04.11.717860 medRxiv

Top 0.3%

14.5%

Microorganisms significantly influence human health, and dysbiosis of the oral microbiome plays a critical role in the development and progression of both oral and systemic diseases. This highlights the urgent need for novel therapeutics targeting specific pathogens. Here, we presented a structure-based pipeline to efficiently identify potential phage-derived periodontal lysins (LysPds) from nearly one million proteins. We predicted the structures of candidate lysins using AlphaFold2 and developed an innovative structure-based similarity network to classify them into distinct clusters, each with unique functional properties. A systematic characterization of 16 representative LysPds from 11 superfamilies revealed that over 90% demonstrated potent antibacterial activity against key periodontal pathogens. Among these, LysPd078 was identified as a promising preclinical drug candidate, effectively reconfiguring microbiome communities while demonstrating significant efficacy and safety in mouse models of periodontitis and calvarial infection. Our findings highlight the effectiveness of structure-based similarity networks in exploring vast protein spaces and underscore the potential of LysPd078 as a targeted modulating agent for the oral microbiome.

14

The BraALA3 Homologs Mediate Propiconazole-Modulated Plant Growth in flowering Chinese cabbage

Gao, Q.; Song, Y.; Yang, Y.; Wang, S.; Ruan, X.; Liu, Z.; Guo, D.; Chen, Y.; Wang, X.; Chen, R.; Xu, H.; Lin, F.

2026-04-19 plant biology 10.64898/2026.04.16.718850 medRxiv

Top 0.3%

14.3%

In agriculture, propiconazole (PCZ) controls excessive growth in flowering Chinese cabbage but poses dietary safety risks due to residue accumulation. Therefore, identifying novel PCZ targets and breeding PCZ-free cultivars is critical for the safe production of flowering Chinese cabbage. Here, we identified three P4-ATPase flippase homologs aminophospholipid ATPase 3 (BraALA3a/b/c) in flowering Chinese cabbage that function as sensitive targets for PCZ. These proteins exhibit high binding affinity for PCZ, which directly inhibits their ATPase activity. Overexpression of the BraALA3 homologs enhanced plant growth and increased sensitivity to PCZ, whereas knockdown led to dwarfism and reduced sensitivity. Based on these findings, we identified editable active sites via protoplast-based screening. Genetic transformation of one such site yielded BraALA3a/braala3aK200T mutant lines, which displayed a dwarf and compact architecture. These findings provide a precise molecular target for developing PCZ-free germplasm in flowering Chinese cabbage through gene editing.

15

High-variance phenome database reveals important roles of WD40 proteins in the plant pathogenic fungus Fusarium graminearum

Choi, S.; Lee, N.; Jeon, H.; Park, J.; Kim, S.; Kim, J.-E.; Shin, J.; Moon, H.; Min, K.; Choi, Y.; Hwangbo, A.; Kim, H.; Choi, G. J.; Lee, Y.-W.; Song, D.-G.; Son, H.

2026-04-20 molecular biology 10.64898/2026.04.19.719521 medRxiv

Top 0.3%

14.2%

WD40 is a highly conserved protein domain in eukaryotes, playing a critical role in various cellular processes. We conducted a genome-wide functional analysis of WD40 genes in Fusarium graminearum, a phytopathogenic fungus that causes severe yield loss and mycotoxin contamination in major cereal crops. Comprehensive phenome analysis of 119 WD40 gene deletion mutants across 22 distinct phenotypic traits revealed phenotypic divergence within the phenome, establishing a strong correlation between virulence and sexual reproduction. Notably, 21 "core WD40 genes" were identified, offering valuable insights into divergent biological processes. Pilot interactome studies of Fgwd101 and Fgwd133 provided further insights into their potential pathobiological functions. Our investigation broadens our knowledge of the biological mechanisms underlying fungal pathogenesis and may assist in the identification of targets for antifungal agents.

16

Spatial genome organization in nematodes with programmed DNA elimination

Simmons, J. R.; Xue, T.; McCord, R. P.; Wang, J.

2026-03-29 genomics 10.1101/2025.10.23.684251 medRxiv

Top 0.3%

14.2%

Programmed DNA elimination (PDE) is a notable exception to genome integrity, characterized by significant DNA loss during development. In many nematodes, PDE is initiated by DNA double-strand breaks (DSBs), which lead to chromosome fragmentation and subsequent DNA loss. However, the mechanism of nematode programmed DNA breakage remains largely unclear. Interestingly, in the human and pig parasitic nematode Ascaris, no conserved motif or sequence structures are present at chromosomal breakage regions (CBRs), suggesting the recognition of CBRs may be sequence-independent. Using Hi-C, we revealed that Ascaris CBRs engage in three-dimensional (3D) interactions before PDE, indicating that physical contacts between break regions may contribute to the PDE process. The 3D interactions are established in both Ascaris male and female germlines, demonstrating inherent genome organization associated with the CBRs and to-be-eliminated sequences. In contrast, in the unichromosomal horse parasite Parascaris univalens, transient pairwise interactions between neighboring CBRs that will form the ends of future somatic chromosomes were observed only during PDE. Intriguingly, we found that Ascaris PDE, which converts 24 germline chromosomes into 36 somatic ones, induces specific compartmentalization changes. Remarkably, Parascaris PDE generates the same set of 36 somatic chromosomes, and the 3D compartment changes following PDE are consistent between the two species. Overall, our findings suggest that CBRs spatially demarcate the retained and eliminated DNA and may contribute to their spatial organization during Ascaris PDE. We also demonstrated that the 3D genome reorganization of the somatic chromosomes in these nematodes following PDE is evolutionarily and developmentally conserved.

17

Comprehensive characterization of Plasmodium vivax antigens using a high-density peptide array

Asawa, R.; Hazzard, B.; Tebben, K.; Tan, J.; Cantaert, T.; Berry, A. A.; Tolia, N. H.; Popovici, J.; Serre, D.

2026-03-18 microbiology 10.64898/2026.03.17.712326 medRxiv

Top 0.4%

14.1%

Plasmodium vivax is the second most prevalent Plasmodium species, with 2.5 billion people at risk of infection worldwide and around 10 million cases of clinical vivax malaria every year. Despite the clinical importance of this pathogen, very little is known about the P. vivax proteins recognized by the host immune system, which hinders our ability to select vaccine candidates or develop efficient serological markers. To comprehensively characterize immunogenic P. vivax proteins, we designed a high-density peptide array containing 4.2 million peptides covering the entire protein sequence of all P. vivax genes and analyzed antibody responses of infected and malaria-naive individuals. We identified a total of 283 proteins that are commonly immunogenic in symptomatic individuals. These proteins included most proteins known to be involved in erythrocyte invasion, a putative new invasion protein, several nucleoporins, and many uncharacterized proteins that should be further investigated for their roles during blood-stage infections. These analyses also revealed a unique pattern of antibody response against PIR proteins in asymptomatic individuals that could be associated with protection against clinical vivax malaria. Overall, these data provide an agnostic and comprehensive perspective on immunogenic P. vivax proteins and constitute an important resource for the malaria community to develop new tools for better detecting and eliminating this important human pathogen.

18

An overexpression platform reveals the functional diversity of human KRAB-Zinc Finger Proteins in maintaining cellular homeostasis

Forey, R.; Raclot, C.; Dorschel, A.; Archambeau, J.; Planet, E.; Bompadre, O.; Offner, S.; Matsushima, W.; van der Goot, F. G.; Trono, D.

2026-04-22 genetics 10.64898/2026.04.20.718945 medRxiv

Top 0.4%

14.0%

Krüppel-associated box zinc finger proteins (KZFPs) form the largest family of transcriptional regulators in mammals, yet most remain uncharacterized. Here we established a scalable framework to probe KZFP function. An arrayed inducible overexpression screen of 366 human KZFPs in K562 cells identified factors that alter cellular proliferation, enabling functional prioritization. Integrative transcriptomic, chromatin and proteomic analyses revealed diverse mechanisms, including transposable element-linked repression (ZNF43), promoter-proximal regulation (ZNF257), and SCAN domain-dependent transcriptional activation (ZNF498/ZSCAN25 and ZNF18). These results highlight the functional diversity of KZFPs and provide a strategy for their annotation.

19

Characterization of long non-coding RNAs during compatible and incompatible pollination in Arabidopsis thaliana

Patel, N.; Gawande, N. D.; Sankaranarayanan, S.

2026-05-01 plant biology 10.64898/2026.04.29.721561 medRxiv

Top 0.4%

13.7%

Long non-coding RNAs (lncRNAs) have emerged as critical players in plant development and stress responses, yet their involvement in pollination responses is largely unknown. To address this gap, we identified and characterized lncRNAs and their cis-acting, trans-acting, and miRNA-mediated regulatory interactions during both compatible and incompatible pollination in Arabidopsis thaliana. Leveraging publicly available datasets, we analyzed expression profiles at 10 and 60 minutes post-pollination. We identified 1,073 novel and 3,422 annotated lncRNAs, with 1,002 novel and 985 annotated, respectively, showing detectable expression after filtering. Differential expression analysis identified 12 lncRNAs at 10 min and 32 lncRNAs at 60 min post-pollination. Further investigation revealed 9 cis-targets, 112 trans-targets, and 144 miRNA-mediated regulatory interactions, many of which were enriched in pathways related to stress, defense, and self-incompatibility. Notably, the regulatory landscape is more active at 60 minutes than at 10 minutes post-pollination. These findings provide a robust framework and resource to facilitate future functional studies of lncRNAs during pollination.

20

Genomic Characterization of the Endangered Medicinal Polypore Agarikon (Laricifomes officinalis syn. Fomitopsis officinalis)

Bennett, P. I.; Bair, Z. J.; Bradshaw, A. J.; Stamets, P.

2026-06-04 genomics 10.64898/2026.06.01.728891 medRxiv

Top 0.4%

12.7%

Agarikon (Laricifomes officinalis syn. Fomitopsis officinalis) is an endangered fungus belonging to a unique lineage in the Polyporales (Basidiomycota) with a growing body of evidence supporting its medicinal value. In this study, we report the hybrid de novo assembly and annotation of the first L. officinalis nuclear and mitochondrial genome sequences, with a nuclear genome size of 28.76 Mb assembled across 66 scaffolds (51.96% GC content; BUSCO completeness of 99.4%), and a complete core mitochondrial genome size of 197.67 kb. Structural and functional annotation of the nuclear genome yielded 8,717 predicted genes including 8,604 protein-coding genes, with 310 genes in 27 biosynthetic gene clusters. We characterized the mating type loci matA and matB, consistent with a tetrapolar mating system, and identified genes encoding key enzymes involved in triterpenoid and polyketide biosynthetic pathways that lead to the production of a diverse array of secondary metabolites. Additionally, we conducted maximum likelihood phylogenomic analysis to confirm the taxonomic position of L. officinalis among 21 species in Polyporales using protein sequences for 860 shared BUSCO genes. This high-quality annotated genome of L. officinalis will serve as a foundation for further investigations into the evolutionary history of this distinct fungal lineage, provide a reference for future population genomic analyses, and elucidate mechanisms underlying the synthesis of the bioactive compounds responsible for agarikon's wide-ranging medicinal benefits.